fix(daemon): include debug context in OTEL metrics log#246
Conversation
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request refines the debug logging for OpenTelemetry metrics processing within the daemon. It enhances diagnostic capabilities by adding structured details, such as the count of resource metrics and the processor's debug state, to the existing debug logs. The primary goal is to provide more context for troubleshooting without increasing the verbosity level beyond debug. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request aims to improve debug logging for OTEL metrics processing by adding more context. However, the implementation contains an incorrect usage of the slog library, which will result in malformed log entries. I've provided a specific comment with a code suggestion to correct the logging call.
| // ProcessMetrics receives OTEL metrics and forwards to backend immediately | ||
| func (p *AICodeOtelProcessor) ProcessMetrics(ctx context.Context, req *collmetricsv1.ExportMetricsServiceRequest) (*collmetricsv1.ExportMetricsServiceResponse, error) { | ||
| slog.Debug("AICodeOtel: Processing metrics request", "resourceMetricsCount", len(req.GetResourceMetrics())) | ||
| slog.Debug("AICodeOtel: Processing metrics request", "resourceMetricsCount", slog.Int("len", len(req.GetResourceMetrics())), slog.Bool("debug", p.debug)) |
There was a problem hiding this comment.
The current usage of slog.Debug is incorrect and will result in a malformed log entry with a !BADKEY attribute. The slog.Bool attribute is not preceded by a string key, which causes slog to report an error in the log output. To fix this and create the intended nested structure for resourceMetricsCount while also including the debug flag, you should use slog.Group.
| slog.Debug("AICodeOtel: Processing metrics request", "resourceMetricsCount", slog.Int("len", len(req.GetResourceMetrics())), slog.Bool("debug", p.debug)) | |
| slog.Debug("AICodeOtel: Processing metrics request", slog.Group("resourceMetricsCount", slog.Int("len", len(req.GetResourceMetrics()))), slog.Bool("debug", p.debug)) |
| // ProcessMetrics receives OTEL metrics and forwards to backend immediately | ||
| func (p *AICodeOtelProcessor) ProcessMetrics(ctx context.Context, req *collmetricsv1.ExportMetricsServiceRequest) (*collmetricsv1.ExportMetricsServiceResponse, error) { | ||
| slog.Debug("AICodeOtel: Processing metrics request", "resourceMetricsCount", len(req.GetResourceMetrics())) | ||
| slog.Debug("AICodeOtel: Processing metrics request", "resourceMetricsCount", slog.Int("len", len(req.GetResourceMetrics())), slog.Bool("debug", p.debug)) |
There was a problem hiding this comment.
🟡 Mixing slog string key with slog.Attr value produces malformed log attribute
On line 83, "resourceMetricsCount" is a string key whose value becomes slog.Int("len", ...) — an slog.Attr. When slog encounters a string key followed by an slog.Attr as the value, it nests the Attr, producing an output like resourceMetricsCount.len=3 instead of the intended resourceMetricsCount=3.
Root Cause and comparison with ProcessLogs
The slog.Debug variadic args are parsed as follows: if an arg is a string, it's treated as a key, and the next arg becomes its value. If an arg is an slog.Attr, it's used directly. On line 83:
slog.Debug("...", "resourceMetricsCount", slog.Int("len", len(req.GetResourceMetrics())), slog.Bool("debug", p.debug))"resourceMetricsCount"→ keyslog.Int("len", ...)→ value (anslog.Attrused as a value for the key above, creating a nested/group attributeresourceMetricsCount.len)slog.Bool("debug", p.debug)→ standalone attr, works fine
Compare with the correct pattern on daemon/aicode_otel_processor.go:138 in ProcessLogs:
slog.Debug("...", "resourceLogsCount", len(req.GetResourceLogs()), slog.Bool("debug", p.debug))Here the value for "resourceLogsCount" is a plain int, producing the expected resourceLogsCount=3 output.
Impact: The metrics count log attribute will have a misleading nested key name (resourceMetricsCount.len) instead of the simple resourceMetricsCount seen in ProcessLogs. This degrades observability for daemon diagnostics — the stated goal of this PR.
| slog.Debug("AICodeOtel: Processing metrics request", "resourceMetricsCount", slog.Int("len", len(req.GetResourceMetrics())), slog.Bool("debug", p.debug)) | |
| slog.Debug("AICodeOtel: Processing metrics request", "resourceMetricsCount", len(req.GetResourceMetrics()), slog.Bool("debug", p.debug)) |
Was this helpful? React with 👍 or 👎 to provide feedback.
Codecov Report❌ Patch coverage is
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 2 files with indirect coverage changes 🚀 New features to boost your workflow:
|
Summary
Testing